LogReader: try zst on internal source #32751
def internal_source_zst(sr: SegmentRange, mode: ReadMode, file_ext: str = "zst") -> LogPaths:
    return internal_source(sr, mode, file_ext)
If one or the other doesn't exist, we'll spend ~0.001s extra on one HEAD request to the first log file in the segment range, which isn't bad.
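A minimal sketch of the probing idea described above: check only the first log file of the segment range, once per extension, and use the first extension that exists. The names `pick_extension` and `exists` are hypothetical and not part of LogReader; `exists` stands in for whatever performs the HEAD request.

```python
from typing import Callable, Optional, Sequence


def pick_extension(first_file_base: str,
                   exists: Callable[[str], bool],
                   extensions: Sequence[str] = ("zst", "bz2")) -> Optional[str]:
    """Return the first extension whose file exists, or None.

    Only the first log file of the range is probed, so a missing
    extension costs one extra existence check (e.g. one HEAD request).
    `exists` is an injected, hypothetical callable taking a URL or path.
    """
    for ext in extensions:
        if exists(f"{first_file_base}.{ext}"):
            return ext
    return None


# Example with an in-memory stand-in for the remote store (assumed layout).
available = {"route/0/rlog.bz2"}
print(pick_extension("route/0/rlog", available.__contains__))
```

Injecting `exists` keeps the sketch independent of any HTTP client; in practice it might wrap a HEAD request against the internal source.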
For downloading to mkv, the important thing is that people iterating on a big list of segments don't unknowingly download the data repeatedly (which costs money and will slow down their work). Maybe we can come up with a different way to manage this.
We probably need to make some changes if this is now a concern. The idea that LogReader can't simply operate on files that can be listed sounds like a problem to me.
I am in favor of only using the internal source when it has the files LogReader needs if that simplifies things, then find a way to exclusively use the internal source when egress bandwidth cost is a concern.
* internal source list files like azure api
* messy but works
* no limit
* simpler
* clean up
* clean up
* clean up
* that's obvious
* better
* we need to unfortunately return a url, so best to take a naive approach for now
* todo
* fix
* clean up
old-commit-hash: b45caf4
Similar to Azure, which allows log files with any extension. Now we discover zst logs/qlogs from internal_source (mkv).
Since our current behavior is to have LogReader cause mkv to download every log file we view to the office (useradmin, PlotJuggler, notebooks, etc.), we need to return all potential URLs for bz2 and zst. Even if we list the files, we still don't always know which will be available.
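As an illustration of "return all potential URLs for bz2 and zst", here is a rough sketch that builds every candidate URL per segment without checking availability. The function name `candidate_urls`, the `base_url` parameter, and the `route/segment/file.ext` URL layout are assumptions for the example, not the actual internal_source implementation.

```python
from typing import Dict, List, Sequence


def candidate_urls(base_url: str,
                   route: str,
                   seg_idxs: Sequence[int],
                   file_name: str = "rlog",
                   extensions: Sequence[str] = ("zst", "bz2")) -> Dict[int, List[str]]:
    """Map each segment index to every URL the log might live at.

    No existence check is made: the caller receives one URL per
    extension, in preference order, and tries them in turn.
    The URL layout here is hypothetical.
    """
    return {
        seg: [f"{base_url}/{route}/{seg}/{file_name}.{ext}" for ext in extensions]
        for seg in seg_idxs
    }


urls = candidate_urls("http://example.invalid", "some_route", [0, 1])
for seg, options in urls.items():
    print(seg, options)
```

Returning every candidate keeps the lookup naive, at the cost of the caller possibly requesting a URL that does not exist.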
@gregjhogan do we still want to silently download all logs to the office when viewed with LogReader? This was not the behavior before the refactor, so I think we should be explicit/document it somewhere if we do want this.